Fast Scalable Construction of (Minimal Perfect Hash) Functions
Recent advances in random linear systems on finite fields have paved the way
for the construction of constant-time data structures representing static
functions and minimal perfect hash functions using less space with respect to
existing techniques. The main obstruction for any practical application of
these results is the cubic-time Gaussian elimination required to solve these
linear systems: although they can be made very small, the computation is still
too slow to be feasible.
In this paper we describe in detail a number of heuristics and programming
techniques to speed up the resolution of these systems by several orders of
magnitude, making the overall construction competitive with the standard and
widely used MWHC technique, which is based on hypergraph peeling. In
particular, we introduce broadword programming techniques for fast equation
manipulation and a lazy Gaussian elimination algorithm. We also describe a
number of technical improvements to the data structure which further reduce
space usage and improve lookup speed.
Our implementation of these techniques yields a minimal perfect hash function
data structure occupying 2.24 bits per element, compared to 2.68 for MWHC-based
ones, and a static function data structure which reduces the multiplicative
overhead from 1.23 to 1.03.
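To make the "equation manipulation" idea concrete, here is a minimal sketch of plain Gaussian elimination over GF(2) in which each equation is packed into a single integer, so that adding two equations is one XOR. It is not the lazy elimination algorithm or the broadword code from the paper; the function name `solve_gf2` and the bit layout (variables in the low bits, constant term in bit `num_vars`) are assumptions made for illustration.

```python
def solve_gf2(rows, rhs, num_vars):
    """Solve A x = b over GF(2).

    rows[i] is an int whose bit j is set when variable j occurs in equation i;
    rhs[i] is the 0/1 constant term. Returns a list of 0/1 values for one
    solution (free variables set to 0), or None if the system is inconsistent.
    """
    var_mask = (1 << num_vars) - 1
    pivots = []  # (pivot variable index, packed equation)
    for row, b in zip(rows, rhs):
        # Pack the constant term above the variable bits: one XOR updates both sides.
        eq = (row & var_mask) | ((b & 1) << num_vars)
        for bit, p in pivots:
            if (eq >> bit) & 1:
                eq ^= p  # eliminate an already-pivoted variable
        var_part = eq & var_mask
        if var_part == 0:
            if eq >> num_vars:
                return None  # reduced to 0 = 1
            continue  # redundant equation
        pivots.append((var_part.bit_length() - 1, eq))

    # Back substitution, last pivot first; unpivoted variables stay 0.
    solution = 0
    for bit, p in reversed(pivots):
        others = p & var_mask & ~(1 << bit)
        parity = bin(others & solution).count("1") & 1
        if parity ^ (p >> num_vars):
            solution |= 1 << bit
    return [(solution >> i) & 1 for i in range(num_vars)]
```

For example, `solve_gf2([0b011, 0b110, 0b101], [1, 0, 1], 3)` encodes x0+x1=1, x1+x2=0, x0+x2=1 and returns an assignment satisfying all three equations.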
On the probability of rendezvous in graphs
In a simple graph without isolated nodes the following random experiment is carried out: each node chooses one of its neighbors uniformly at random. We say a rendezvous occurs if there are adjacent nodes and such that chooses and chooses ; the probability that this happens is denoted by . Métivier et al. (2000) asked whether it is true that for all -node graphs , where is the complete graph on nodes. We show that this is the case. Moreover, we show that evaluating for a given graph is a #P-complete problem, even if only -regular graphs are considered, for any
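The random experiment described above can be evaluated exactly on very small graphs by enumerating every combination of choices. The sketch below does this by brute force; the function name `rendezvous_probability` and the adjacency-dictionary input format are assumptions for illustration, and the running time is exponential in the number of nodes, in keeping with the hardness result stated in the abstract.

```python
from fractions import Fraction
from itertools import product


def rendezvous_probability(adj):
    """Exact probability that some pair of adjacent nodes choose each other.

    adj maps each node to the collection of its neighbours (simple graph,
    no isolated nodes). Every node picks one neighbour uniformly and
    independently, so all tuples of choices are equally likely.
    """
    nodes = sorted(adj)
    options = [sorted(adj[v]) for v in nodes]
    total = 0
    hits = 0
    for picks in product(*options):
        chosen = dict(zip(nodes, picks))
        total += 1
        # A rendezvous: v chose w and w chose v.
        if any(chosen[chosen[v]] == v for v in nodes):
            hits += 1
    return Fraction(hits, total)


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path3 = {0: [1], 1: [0, 2], 2: [1]}
print(rendezvous_probability(triangle))  # 3/4
print(rendezvous_probability(path3))     # 1
```

On the triangle only the two cyclic choice patterns avoid a rendezvous, giving 6/8; on the three-node path the middle node must pick an endpoint that has already picked it.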
Quicksort, Largest Bucket, and Min-Wise Hashing with Limited Independence
Randomized algorithms and data structures are often analyzed under the
assumption of access to a perfect source of randomness. The most fundamental
metric used to measure how "random" a hash function or a random number
generator is, is its independence: a sequence of random variables is said to be
k-independent if every variable is uniform and every size-k subset is
independent. In this paper we consider three classic algorithms under limited
independence. We provide new bounds for randomized quicksort, min-wise hashing
and largest bucket size under limited independence. Our results can be
summarized as follows.
- Randomized quicksort. When pivot elements are computed using a
-independent hash function, Karloff and Raghavan, J.ACM'93 showed expected worst-case running time for a special version of quicksort.
We improve upon this, showing that the same running time is achieved with only
-independence.
- Min-wise hashing. For a set , consider the probability of a particular
element being mapped to the smallest hash value. It is known that
-independence implies the optimal probability . Broder et al.,
STOC'98 showed that -independence implies it is . We show
a matching lower bound as well as new tight bounds for - and -independent
hash functions.
- Largest bucket. We consider the case where balls are distributed to
buckets using a -independent hash function and analyze the largest bucket
size. Alon et al., STOC'97 showed that there exists a -independent hash
function implying a bucket of size . We generalize the
bound, providing a -independent family of functions that imply size .
Comment: Submitted to ICALP 201
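As a hedged illustration of what a limited-independence hash function looks like in practice, the sketch below uses the textbook construction: a random polynomial of degree k-1 over a prime field, which is k-independent on that field. The helper names (`make_poly_hash`, `largest_bucket`, `min_hash_element`) and the choice of prime are assumptions, not taken from the paper, and reducing the hash value modulo the number of buckets makes the bucket distribution only approximately uniform.

```python
import random
from collections import Counter

P = (1 << 61) - 1  # a Mersenne prime; keys are assumed to lie in [0, P)


def make_poly_hash(k, rng):
    """Return h(x) = random polynomial of degree k-1 evaluated mod P."""
    coeffs = [rng.randrange(P) for _ in range(k)]

    def h(x):
        acc = 0
        for c in coeffs:  # Horner's rule
            acc = (acc * x + c) % P
        return acc

    return h


def largest_bucket(keys, num_buckets, h):
    """Size of the fullest bucket when each key x goes to bucket h(x) mod num_buckets."""
    counts = Counter(h(x) % num_buckets for x in keys)
    return max(counts.values())


def min_hash_element(keys, h):
    """The key that receives the smallest hash value (the min-wise sample)."""
    return min(keys, key=h)


rng = random.Random(2015)  # fixed seed only to make the run repeatable
h2 = make_poly_hash(2, rng)
n = 1000
print(largest_bucket(range(n), n, h2))
print(min_hash_element(range(n), h2))
```

Varying `k` and repeating over many seeds gives an empirical feel for how the largest bucket and the min-wise sampling probability behave under different degrees of independence.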
Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts
We study the approximate string matching and regular expression matching
problem for the case when the text to be searched is compressed with the
Ziv-Lempel adaptive dictionary compression schemes. We present a time-space
trade-off that leads to algorithms improving the previously known complexities
for both problems. In particular, we significantly improve the space bounds,
which in practical applications are likely to be a bottleneck.
A Reconfigurations Analogue of Brooks’ Theorem
Let G be a simple undirected graph on n vertices with maximum degree Δ. Brooks’ Theorem states that G has a Δ-colouring unless G is a complete graph, or a cycle with an odd number of vertices. To recolour G is to obtain a new proper colouring by changing the colour of one vertex. We show that from a k-colouring, k > Δ, a Δ-colouring of G can be obtained by a sequence of O(n^2) recolourings using only the original k colours unless G is a complete graph or a cycle with an odd number of vertices, or k = Δ + 1, G is Δ-regular and, for each vertex v in G, no two neighbours of v are coloured alike. We use this result to study the reconfiguration graph R_k(G) of the k-colourings of G. The vertex set of R_k(G) is the set of all possible k-colourings of G and two colourings are adjacent if they differ on exactly one vertex. It is known that if k ≤ Δ(G), then R_k(G) might not be connected and its connected components may have superpolynomial diameter, whereas if k ≥ Δ(G) + 2, then R_k(G) is connected and has diameter O(n^2). We complete this structural classification by settling the missing case: if k = Δ(G) + 1, then R_k(G) consists of isolated vertices and at most one further component which has diameter O(n^2). We also describe completely the computational complexity classification of the problem of deciding whether two k-colourings of a graph G of maximum degree Δ belong to the same component of R_k(G) by settling the case k = Δ(G) + 1. The problem is O(n^2) time solvable for k = 3, PSPACE-complete for 4 ≤ k ≤ Δ(G), O(n) time solvable for k = Δ(G) + 1, and O(1) time solvable for k ≥ Δ(G) + 2 (the answer is always yes).
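The reconfiguration graph defined above can be built explicitly for tiny instances, which makes the three regimes of k easy to see. The following is a brute-force sketch, not the algorithm from the paper; the function names and the adjacency-dictionary input are assumptions for illustration, and the enumeration is exponential in the number of vertices.

```python
from itertools import product


def proper_colourings(adj, k):
    """All proper k-colourings of the graph, as tuples in sorted-vertex order."""
    nodes = sorted(adj)
    result = []
    for colours in product(range(k), repeat=len(nodes)):
        col = dict(zip(nodes, colours))
        if all(col[u] != col[v] for u in nodes for v in adj[u]):
            result.append(colours)
    return nodes, result


def reconfiguration_components(adj, k):
    """Sorted sizes of the connected components of the reconfiguration graph.

    Two proper k-colourings are adjacent when they differ on exactly one vertex.
    """
    nodes, colourings = proper_colourings(adj, k)
    valid = set(colourings)
    seen = set()
    sizes = []
    for start in colourings:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:  # depth-first search over single-vertex recolourings
            cur = stack.pop()
            size += 1
            for i in range(len(nodes)):
                for c in range(k):
                    if c == cur[i]:
                        continue
                    nxt = cur[:i] + (c,) + cur[i + 1:]
                    if nxt in valid and nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
        sizes.append(size)
    return sorted(sizes)


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # maximum degree 2
print(reconfiguration_components(triangle, 3))  # [1, 1, 1, 1, 1, 1]
print(reconfiguration_components(triangle, 4))  # [24]
```

With k = Δ + 1 = 3 every colouring of the triangle is frozen, so the reconfiguration graph consists of six isolated vertices; with k = Δ + 2 = 4 all 24 colourings lie in one component.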
Knocking Out P_k-free Graphs
A parallel knock-out scheme for a graph proceeds in rounds in each of which each surviving vertex eliminates one of its surviving neighbours. A graph is KO-reducible if there exists such a scheme that eliminates every vertex in the graph. The Parallel Knock-Out problem is to decide whether a graph G is KO-reducible. This problem is known to be NP-complete and has been studied for several graph classes since MFCS 2004. We show that the problem is NP-complete even for split graphs, a subclass of P_5-free graphs. In contrast, our main result is that it is linear-time solvable for P_4-free graphs (cographs).
Matchings on infinite graphs
Elek and Lippner (2010) showed that the convergence of a sequence of
bounded-degree graphs implies the existence of a limit for the proportion of
vertices covered by a maximum matching. We provide a characterization of the
limiting parameter via a local recursion defined directly on the limit of the
graph sequence. Interestingly, the recursion may admit multiple solutions,
implying non-trivial long-range dependencies between the covered vertices. We
overcome this lack of correlation decay by introducing a perturbative parameter
(temperature), which we let progressively go to zero. This allows us to
uniquely identify the correct solution. In the important case where the graph
limit is a unimodular Galton-Watson tree, the recursion simplifies into a
distributional equation that can be solved explicitly, leading to a new
asymptotic formula that considerably extends the well-known one by Karp and
Sipser for Erdős-Rényi random graphs.
Forbidden Induced Subgraphs and the Price of Connectivity for Feedback Vertex Set
Let fvs(G) and cfvs(G) denote the cardinalities of a minimum feedback vertex set and a minimum connected feedback vertex set of a graph G, respectively. For a graph class 𝒢, the price of connectivity for feedback vertex set (poc-fvs) for 𝒢 is defined as the maximum ratio cfvs(G)/fvs(G) over all connected graphs G in 𝒢. It is known that the poc-fvs for general graphs is unbounded. We study the poc-fvs for graph classes defined by a finite family H of forbidden induced subgraphs. We characterize exactly those finite families H for which the poc-fvs for H-free graphs is bounded by a constant. Prior to our work, such a result was only known for the case where |H|=1.
Wear Minimization for Cuckoo Hashing: How Not to Throw a Lot of Eggs into One Basket
We study wear-leveling techniques for cuckoo hashing, showing that it is
possible to achieve a memory wear bound of after the
insertion of items into a table of size for a suitable constant
using cuckoo hashing. Moreover, we study our cuckoo hashing method empirically,
showing that it significantly improves on the memory wear performance for
classic cuckoo hashing and linear probing in practice.
Comment: 13 pages, 1 table, 7 figures; to appear at the 13th Symposium on
Experimental Algorithms (SEA 2014)